A spectral algorithm for fast de novo layout of uncorrected long nanopore reads

Authors

  • Antoine Recanati
  • Thomas Brüls
  • Alexandre d'Aspremont
Abstract

Motivation: New long-read sequencers promise to transform sequencing and genome assembly by producing reads tens of kilobases long. However, their high error rate significantly complicates assembly and requires expensive correction steps to lay out the reads using standard assembly engines. Results: We present an original and efficient spectral algorithm to lay out uncorrected nanopore reads, and its seamless integration into a straightforward overlap/layout/consensus (OLC) assembly scheme. The method is shown to assemble Oxford Nanopore reads from several bacterial genomes into good-quality (∼99% identity to the reference) genome-sized contigs, while yielding more fragmented assemblies from the eukaryotic microbe Saccharomyces cerevisiae. Availability and implementation: https://github.com/antrec/spectrassembler. Contact: [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.

Related articles

Fast and accurate de novo genome assembly from long uncorrected reads.

The assembly of long reads from Pacific Biosciences and Oxford Nanopore Technologies typically requires resource-intensive error-correction and consensus-generation steps to obtain high-quality assemblies. We show that the error-correction step can be omitted and that high-quality consensus sequences can be generated efficiently with a SIMD-accelerated, partial-order alignment-based, stand-alon...

De novo Clustering of Gene Expressed Variants in Transcriptomic Long Reads Data Sets

This work addresses the problem of grouping by genes long reads expressed in a whole transcriptome sequencing data set. Long read sequencing produces several thousands basepair long sequences, although showing high error rate in comparison to short reads. Long reads can cover full-length RNA transcripts and thus are of high interest to complete references. However, the literature is lacking too...

Oxford Nanopore Sequencing and de novo Assembly of a Eukaryotic Genome

Monitoring the progress of DNA through a pore has been postulated as a method for sequencing DNA for several decades 1,2. Recently, a nanopore instrument, the Oxford Nanopore MinION, has become available 3. Here we describe our sequencing of the S. cerevisiae genome. We describe software developed to make use of these data as existing packages were incapable of assembling long reads at such hig...

A sequencer coming of age: genome assembly using De novo

Nanopore technology provides a novel approach to DNA sequencing that yields long, label-free reads of constant quality. The first commercial implementation of this approach, the MinION, has shown promise in various sequencing applications. This review gives an up-to-date overview of the MinION's utility as a sequencing device. It is argued that the MinION de novo may allow for portable and af...

Oxford Nanopore sequencing, hybrid error correction, and de novo assembly of a eukaryotic genome.

Monitoring the progress of DNA molecules through a membrane pore has been postulated as a method for sequencing DNA for several decades. Recently, a nanopore-based sequencing instrument, the Oxford Nanopore MinION, has become available, and we used this for sequencing the Saccharomyces cerevisiae genome. To make use of these data, we developed a novel open-source hybrid error correction algorit...

Journal:
  • Bioinformatics

Volume 33, Issue 20

Pages: -

Publication date: 2017